Differentiable Memory
Authors
Abstract
We describe a new spatio-temporal video autoencoder, based on a classic spatial image autoencoder and a novel nested temporal autoencoder. The temporal encoder is a differentiable visual memory composed of convolutional long short-term memory (LSTM) cells that integrate changes over time. Here we target motion changes and use as temporal decoder a robust optical flow prediction module together with an image sampler serving as a built-in feedback loop. The architecture is end-to-end differentiable. At each time step, the system receives a video frame as input, predicts the optical flow as a dense transformation map based on the current observation and the LSTM memory state, and applies it to the current frame to generate the next frame. By minimising the reconstruction error between the predicted next frame and the corresponding ground-truth next frame, we train the whole system to extract features useful for motion estimation without any supervision effort. We believe these features can in turn facilitate learning high-level tasks such as path planning, semantic segmentation, or action recognition, reducing the overall supervision effort.
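The per-step loop described above (predict a dense flow map, warp the current frame with it, and minimise the reconstruction error against the true next frame) can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the convolutional-LSTM flow predictor is omitted, the differentiable bilinear sampler is replaced by nearest-neighbour lookup for brevity, and `warp_frame` / `reconstruction_loss` are hypothetical helper names.

```python
import numpy as np

def warp_frame(frame, flow):
    """Warp a frame with a dense flow field (nearest-neighbour sampling).

    frame: (H, W) array of intensities.
    flow:  (H, W, 2) array of per-pixel (dy, dx) sampling displacements.
    The paper's image sampler is differentiable (bilinear); nearest
    neighbour is used here only to keep the sketch short.
    """
    H, W = frame.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(np.round(ys + flow[..., 0]).astype(int), 0, H - 1)
    src_x = np.clip(np.round(xs + flow[..., 1]).astype(int), 0, W - 1)
    return frame[src_y, src_x]

def reconstruction_loss(pred, target):
    """Mean squared error between the predicted and ground-truth next frame."""
    return float(np.mean((pred - target) ** 2))

# One unsupervised training step: warp frame t with the predicted flow
# and compare against frame t+1; the loss gradient (here omitted) would
# flow back through the sampler into the flow predictor and LSTM memory.
frame_t = np.arange(16, dtype=float).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 1] = 1.0  # sample one pixel to the right everywhere
predicted_next = warp_frame(frame_t, flow)
loss = reconstruction_loss(predicted_next, frame_t)
```

In the actual architecture the sampler must be differentiable so that the reconstruction loss can train the flow predictor and the LSTM memory end-to-end; the nearest-neighbour lookup above breaks that property and stands in only for exposition.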
Similar papers
Endomorphisms of Banach algebras of infinitely differentiable functions on compact plane sets
This note is a sequel to [7], where we investigated the endomorphisms of a certain class of Banach algebras of infinitely differentiable functions on the unit interval. Start with a perfect, compact plane set X. We say that a complex-valued function f defined on X is complex-differentiable at a point a ∈ X if the limit f′(a) = lim_{z→a, z∈X} (f(z) − f(a))/(z − a) exists. We call f′(a) the complex deriv...
Differentiable Functional Program Interpreters
Programming by Example (PBE) is the task of inducing computer programs from input-output examples. It can be seen as a type of machine learning where the hypothesis space is the set of legal programs in some programming language. Recent work on differentiable interpreters relaxes the discrete space of programs into a continuous space so that search over programs can be performed using gradient-...
Frustratingly Short Attention Spans in Neural Language Modeling
Neural language models predict the next token using a latent representation of the immediate token history. Recently, various methods for augmenting neural language models with an attention mechanism over a differentiable memory have been proposed. For predicting the next token, these models query information from a memory of the recent history which can facilitate learning mid- and long-range de...
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both space and time as the amount of memory grows — limiting their applicability to real-world domains. Here, we present an end-to-end differentiable memory...
Spatio-temporal video autoencoder with differentiable memory
We describe a new spatio-temporal video autoencoder, based on a classic spatial image autoencoder and a novel nested temporal autoencoder. The temporal encoder is represented by a differentiable visual memory composed of convolutional long short-term memory (LSTM) cells that integrate changes over time. Here we target motion changes and use as temporal decoder a robust optical flow prediction m...